Environment normalization for robust speech recognition using direct cepstral comparison

Authors

  • Fu-Hua Liu
  • Richard M. Stern
  • Alex Acero
  • Pedro J. Moreno
Abstract

In this paper we describe and evaluate a series of new algorithms that compensate for the effects of unknown acoustical environments (or changes in environment) by adding compensation vectors to the cepstral representations of speech that is input to a speech recognition system. These compensation vectors are obtained from direct frame-by-frame comparisons of the cepstral representations of speech that is simultaneously recorded in the training environment and in various testing environments, but the algorithms do not make use of such “stereo” speech data when analyzing speech from an unknown environment. We compare the improvement in recognition accuracy provided by the algorithms using standard ARPA speech recognition corpora. For example, the normalization algorithm known as MFCDCN provided a 22% reduction in word error rate relative to results obtained using cepstral mean normalization on the 1992 ARPA WSJ/CSR corpus, and a 56.6% reduction in error rate relative to baseline processing. A family of new algorithms, PDCN, which accomplishes the environment normalization inside the decoder, is described and evaluated on the same corpus. A substantial word error rate reduction of 66.8% relative to the baseline system can be achieved by combining MFCDCN and PDCN in a system with cepstral mean normalization.
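The core idea in the abstract, additive compensation vectors estimated from frame-by-frame comparison of simultaneously recorded ("stereo") cepstra, can be illustrated with a minimal sketch. It assumes the simplest possible case: a single compensation vector per environment, taken as the mean difference between training-environment and testing-environment cepstral frames. The function names and toy values are hypothetical, and the sketch does not reproduce MFCDCN or PDCN, whose details are not given here.

```python
def estimate_compensation_vector(train_frames, test_frames):
    """Average the per-frame cepstral difference (train minus test).

    Both arguments are lists of equal-length cepstral vectors recorded
    simultaneously, so frame i in one list aligns with frame i in the other.
    Assumption: one vector per environment (the real algorithms are richer).
    """
    if not train_frames or len(train_frames) != len(test_frames):
        raise ValueError("stereo data must be non-empty and frame-aligned")
    dim = len(train_frames[0])
    totals = [0.0] * dim
    for train_vec, test_vec in zip(train_frames, test_frames):
        for k in range(dim):
            totals[k] += train_vec[k] - test_vec[k]
    return [total / len(train_frames) for total in totals]


def compensate(frames, vector):
    """Add the compensation vector to every incoming cepstral frame."""
    return [[c + r for c, r in zip(frame, vector)] for frame in frames]


# Toy stereo data: two frames of two cepstral coefficients each.
train_env = [[1.0, 2.0], [3.0, 4.0]]
test_env = [[0.5, 2.5], [2.5, 4.5]]
correction = estimate_compensation_vector(train_env, test_env)

# New speech from the same testing environment needs no stereo recording.
normalized = compensate([[0.0, 1.0]], correction)
```

Note that stereo data is used only to estimate the vector; applying it to new speech requires only the incoming frames, which matches the abstract's statement that stereo data is not used when analyzing speech from an unknown environment.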

Similar resources

Improving the performance of MFCC for Persian robust speech recognition

The Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...


Powered cepstral normalization (p-CN) for robust features in speech recognition

Cepstral normalization has been popularly used as a powerful approach to produce robust features for speech recognition. Good examples of approaches in this family include the well known Cepstral Mean Subtraction (CMS) and Cepstral Mean and Variance Normalization (CMVN), in which either the first or both the first and the second moments of the Mel-frequency Cepstral Coefficients (MFCCs) are nor...


A New Data Driven Method for Robust Speech Recognition

The conventional view on the problem of robustness in speech recognition is that performance degradation in ASR systems is due to mismatch between training and test conditions. If the problem of robustness in ASR systems is considered as a mismatch between the training and testing conditions, the solution is to find a way to reduce it. Common approaches are: Data-Driven methods such as speec...


Augmented Cepstral Normalization for Robust Speech Recognition

We proposed an augmented cepstral mean normalization algorithm that differentiates noise and speech during normalization, and computes a different mean for each. Compared with standard cepstral mean normalization, the new procedure reduced the error rate slightly for same-environment testing, and by a significant 25% when an environmental mismatch exists.


Evaluation and Modification of Cepstral Moment Normalization for Speech Recognition in Additive Babble Ensemble

The statistical properties of a speech feature could differ under the influence of noisy environments. These effects are common in mismatched environments such as additive background noise and reverberant environments. Normalization strategies are employed in speech recognition systems to compensate for the effects of environmental mismatch. This work explores the utilization of cepstral moment...




Publication date: 1994